Dynamical networks reconstructed from time series 
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Abstract. Novel method of reconstructing dynamical networks from empirically measured time series is proposed. By exam- 
ining the variable-derivative correlation of network node pairs, we derive a simple equation that directly yields the adjacency 
matrix, assuming the intra-network interaction functions to be known. We illustrate the method on a simple example, and 
discuss the dependence of the reconstruction precision on the properties of time series. Our method is applicable to any 
network, allowing for reconstruction precision to be maximized, and errors to be estimated. 
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1 Introduction 

Complex systems are ubiquitous in nature. On all scales from genes to societies, various systems are composed of many 
units, which are able to collectively perform complicated tasks despite their simplicity (TJ. In the recent years, the framework 
of complex networks was recognized as an excellent formalism for studying complex systems. By representing units as nodes 
and modeling their interactions as links [2], science of complex networks introduced graph analysis methods into physics, 
biology, engineering and even sociology [ 3 1 . This allowed for a variety of real and artificial complex systems to be extensively 
examined, typically via computational modeling [4|. Crucial aspect of a complex network is its structure, i.e. the topology 
of connections among its nodes. Properties of network structure dictate its global behavior, and are key to understanding the 
network's functioning and potentials for its control. For simple oscillator models, profound intertwinement between network 
structure and network dynamics was recently shown |5] . 

Since the structure of many natural networks is only partially known, it is of central interest to develop methods for recon- 
structing the network structure from the available empirical information. Various experimental techniques in this directions 
are already in use, specially in the context of gene regulation networks (6). In addition, a range of mathematical results is 
available [7 |. Recently, the topology of a social network was inferred using mobile phone data (S). Invasive reconstruction 
methods involve perturbing the network dynamics which allows for structural data to be easily extracted [9|. Although inva- 
sive methods generally give good results, it is often unpractical to interact with the on-going network dynamics. Non-invasive 
reconstruction methods focus on investigation of the observable network outputs, such as the time series quantifying the 
system's dynamics iflOl fTTl fT2l [131 . The relevance of non-invasive approach is increasingly recognized, particularly due its 
suitability for detecting links in biological networks lfl"2"l[T3"l . Alternatively, reconstruction methods also rely on techniques 
from control theory fBl . and even compressive sensing [ 15 1. 

In this contribution, we propose a novel network reconstruction method, based on examining the correlations between 
the variables and the derivatives corresponding to different nodes. Our central assumption is the precise knowledge of the 
functional forms of the intra-network interactions. As we show, depending on the quantity of network information contained 
in the empirical data, our method can give very precise results even for very short time series, thus being suitable for actual 
experimental situations. Apart from being non-invasive, our method is conceptually very simple, and easy to numerically 
implement. In contrast to a recent result based on the same hypothesis ifTOl . our method avoids solving the overdetermined 
linear system, and allows for the reconstruction error to be estimated. 



2 The Reconstruction Model 

We consider a complex system composed of N interacting units, which we represent as a network with N nodes, whose links 
model the interaction between the nodes. Each node is assigned a dynamical state defined by the real variable X{ = Xi(t), 
where i = 1,...N. We assume to be in the possession of empirically obtained discrete-time trajectories Xi(t m ) which 
describe the system's dynamical evolution over a certain time interval. The available data consists of N sequences, each 
containing L values Xj(ti), . . . Xi{ti,). The measurements of xi are separated by the observation interval 5 t = t m +i — t m , 
which defines the resolution of the time series (sampling frequency). Time interval St is uniform and assumed smaller than 
the characteristic dynamical time scale. 



We further assume the time-evolution of the node i to be given by: 



N 

Xj = h(xj) + Ajjfcjxj) , (1) 
i=i 

where we describe the local dynamics via function and the network (inter-node) interaction by the function fc- The 

network structure is encoded in the adjacency matrix A^, whose element ij specifies the strength with which the node i acts 
on the node j. The dynamics of the node i is a cumulative effect of its local dynamics and the sum of contributions from its 
networks neighbors that come with different strengths. Finally, we also assume that both interaction functions /j, and fc are 
precisely known. 

We seek to reconstruct the network's adjacency matrix under the named assumptions - by having the "fingerprint" 
of system's behavior, we attempt to reveal its structure. The above assumptions on which we build our theory are realistic. 
Many natural systems are modeled using Eq[TJ examples include gene regulation and neural interactions, for which the 
interaction functions are widely investigated, and do not vary with network links. Modern experimental techniques allow for 
high-resolution measurements of quantities such as gene expression, although thus obtained time series are typically short. 

When examining inter-dependence between dynamical quantities, one is typically interested in calculating the correlation 
between two dynamical variables. Inspired by this, we construct our theory based on investigating the correlation between a 
variable (xj) and the derivative of another variable (ij). We hence examine the correlation between the motion of the node i, 
and the speed of node j. We start by defining the following matrices: 

Bij — (X%Xjj , 

Cij = (xif L (xj)) , (2) 
Eij = (xifc[xj)) , 

where (•) denotes the time-average of a dynamical quantity (i.e., average over the recorded time-evolution) (h) = j- Yl m =i h(t r . 
This allows for EqQ]to be re-written in the matrix form: 

Aij = (Bik — Cik) ■ E h ^ , (3) 

which is our main network reconstruction equation. We introduce a new set of time points: 



so that 

Xi {Tn 

and accordingly: 

fL,c(T m ] 



, m = 1, . . . , L — 1 

_ 2-z(^m+l) Xi(tjri) 

= St ' 

fL,c(tm+l) + fL,c(t m ) 



This provides a more stable estimation of both interaction function values and the derivative values. We will rely on this 
calculation scheme for the implementation of our theory through Eqj3] Note that in principle, our method is applicable to any 
network, and the reconstruction is precisely correct in the limit of very long time series. However, since the empirical data 
are not only finite, but typically very short, our method will in general yield an approximate reconstruction. 

To discern from the original adjacency matrix A^, we term the reconstructed adjacency matrix Rij, and quantify the 
matrix reconstruction error as follows: 
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Natural test to make for each obtained Rij is to quantify how well does it reproduce the original empirical data Xi(t m ). To 
achieve this, we apply the following procedure: start the run from Xi(tx) for all nodes, and run the dynamics using adjacency 
matrix for the time interval St, i.e. until the time t%. Denote thus obtained values yifa), re-start the run from Xiiti) 
running until t%, accordingly obtaining yi{tz), and so on. The discrepancy that the time series yi(t m ) show in comparison to 
Xi(t m ) is the most straightforward measure of the reconstruction precision for matrix Rij. We name it trajectory error At, 
and define as follows: 




This way we measure point-by-point exactness of the reconstructed trajectory, which quantifies how well does it conform to 
the empirical data. Small A^ is necessary, but not sufficient for a good reconstruction - easily reproducible time series (such 
as periodic orbits) always display very small At regardless of A^, since many different networks can produce such data. On 
the other hand, hardly reproducible time series (such as transient or chaotic orbits) may show large A^ that does not imply 
large Aa- However, as we show, two errors are in general related, allowing for estimation of A^ based on Rij and At only. 



3 Results 



We test our reconstruction method using a simple illustrative example. A network with N = 6 nodes is constructed by placing 
L = 17 directed links between randomly chosen pairs of nodes, while requiring the resulting network to be connected. Links 
are weighted with positive and negative weights, uniformly selected at random from [—10, 10]. The studied network is 
illustrated in Fig[TJ and its adjacency matrix Aij is shown in Figf3^. The dynamics is defined on the network via Hansel- 




Figure 1 : Graphical representation of the studied network. Link thickness illustrates the interaction strength. Red/blue link 
colors (light/dark shades) indicate positive/negative inter-node interactions. Nodes are numbered in accordance with the 
adjacency matrix shown in Fig|3^. 



Sompolinsky model 1 16 1 by putting //, = —x and fc = tanhx in Eq[TJ The complete dynamics on network reads: 



Xi = —Xi + y Aji tanh(xj) 



(4) 



For each node we randomly select an initial condition from [—1, 1], and numerically integrate Eq]4]from time t — to t = 3. 
During the run, we store 15 values for each Xi, equally spaced in time, starting with Xi(t% = 0). Thus obtained time series 
for all nodes are shown in Figf2] 
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Figure 2: Time series for all 6 nodes for network Fig[T] obtained for the first set of initial conditions. 

We assume now these time series to be obtained from an "external" source (e.g. coming from experimental measurement), 
and seek to employ them to reconstruct the network's adjacency matrix as discussed in the previous Section. To this end, we 
numerically compute the matrices B^, CV, and Eij, and obtain the reconstructed adjacency matrix Rij via Eqf3] The result 
is shown in Figf3]- the original Ay in (a), the reconstructed in (b), and link-by-link comparison of and in (c). 
The reconstructed matrix R^ reasonably well approximates the original Aij, both for zero and non-zero weights. The matrix 
error is = 0.18, and the trajectory error is At = 0.038, indicating a good reconstruction precision. 

We now run another simulation of our dynamical system EqUwith the same underlying network, but this time starting 
from a different set of initial conditions. A new set of time series of equal size and resolution is obtained and shown in FiglH 
from which we seek to reconstruct our network again. The new results are shown in Figj5] and organized in analogy with 
Fig]3] The new R^ has matrix error = 0.56 and a trajectory error of At — 0.05, which is considerably worse than in 
the previous example, as it can also be clearly seen by comparing Figf3j: and Fig|5};. 

Despite that both sets of time series were produced by the same dynamical network, two reconstructed networks are 
different. This shows that besides depending on the length and resolution of the time series, the reconstruction precision 
crucially depends on the "quality" of time series as well, i.e. on the quantity of network information contained in them. 
Easily reproducible data contains less information than hardly reproducible data. As just illustrated, a given dynamical 
network can yield different time series depending on the initial conditions. Is there a relation between the two errors that 
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Figure 3: Original A^ and reconstructed Rij adjacency matrices in (a) and (b) respectively. Colorbar (shade) indicates the 
weights obtained from time series Figj2] Link-by-link weights comparison of A^ (circles) and Rij (crosses) in (c). Matrix 
error = 0.18, trajectory error At = 0.038. 
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Figure 4: Time series for all 6 nodes for network FigQ] obtained for the second set of initial conditions. 




Figure 5: Original A^ and reconstructed Rij adjacency matrices in (a) and (b) respectively. Colorbar (shade) indicates the 
weights obtained from time series Fig0] Link-by-link weights comparison of Aij (circles) and Rij (crosses) in (c). Matrix 
error A^ = 0.56, trajectory error At — 0.05. 



could be used to estimate A^ based only on At, independently on the "quality" of time series? The final Section of this 
paper is devoted to providing at least a preliminary answer to this question. 



4 Discussion 

The proposed reconstruction method in principle applies to any network whose inter-node interactions can be described via 
EqUJ The final reconstruction precision depends on a number of factors: (/) length and resolution of time series, also related 
to the precision of derivative estimates; (if) quantity of network information contained in the empirical data, which can be 
seen as reproducibility of the time series, or coverage of the dynamical phase space with data; (iif) invertibility of the matrix 
Eij\ and finally, (iv) properties of the network itself - some networks can be more reconstructable than others. In a concrete 
reconstruction problem, it is difficult to isolate how much each factor contributes to A a- Instead of quantifying this, we show 
a generalization of our method, done towards improving the reconstruction precision and estimating reconstruction errors. 

Our method is based on calculating the correlations between the variable Xi and other terms, as defined in Eqf2] More 
generally, we can replace Xi by g(xi), where g is an arbitrary function, without changing the main result. Eqf2]now becomes: 



= (9(xi)ij) , 

= {g{xi)f L {xj)) , 



(5) 



where notation Bn indicates that the matrix Bij was calculated via Eq|5]using function g. Eq[3j which now reads: 

B@ = (Blf - Clf) -4f-\ (6) 

still holds for any function g. As before, in the limit of very long time series, the reconstruction is precisely correct for any 
choice of g. For realistic scenarios involving very short time series, the reconstruction precision will depend on g, as two 
different g-s will in general yield two different R^-s. This means that g plays the role of a tunable parameter, which can be 
used to find the best reconstruction. By considering a set of functions g, we can compute for each of them, and define 

as the best reconstruction that Ry whose reconstructed dynamics shows minimal A^ (we generalize matrix and trajectory 
errors to depend on g). A good choice of g will extract more extractable network information hidden in the empirical data, 
and improve the simple reconstruction for g(x) = x. Moreover, variations of R^ with g are related to the reconstruction 
precision - for a reliable reconstruction, the obtained R^j will not strongly depend on changes of g. On the other hand, a 
bad reconstruction will be recognized by a drastic dependence of R^ on g. Note that the functional properties of g itself are 

irrelevant - the only role of g is the computation of R^' . 

To illustrate the implementation of our generalized method, we examine again the second set of time series shown in 
Fig|4] We consider the set of functions g(x) — x n , where for n we take integers between -20 and 20 (except 0). The network 
is reconstructed using Eq|6]for each g, and the corresponding and A^ are calculated. The results are shown in FigO 
where each Rf^ is represented through its A^ and A^ . There is a visible correlation between the two errors, suggesting 
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Figure 6: Reconstructions using many functions g. Each point is specified by Ajr and A)f corresponding to Rn obtained 
via one of g(x) — x n for n integer between -20 and 20 except 0. Points obtained for g(x) — x and g(x) = x 19 are indicated. 

that smaller A^ , on average, leads to a smaller A^'. Following this principle, for function g(x) = x 19 we find the smallest 
trajectory error A^ — 0.02 leading to A^' = 0.11. As indicated in Fig|6] this result is much better than what obtained 
for g(x) = x. In addition, this result is better than the one found for time series from Figf2] For comparison, we show the 
reconstruction for g{x) = x 19 in Fig|7] Note however, that this is not the best result in terms of A^': for g{x) — x~ 6 we 




Figure 7: Original Aij and reconstructed R\^ adjacency matrices in (a) and (b) respectively. Colorbar (shade) indicates the 
weights obtained from time series FigJU using the function g(x) = x 19 . Link-by-link weights comparison of (circles) 
and R^f (crosses). Matrix error A^ =0.11, trajectory error A^ } = 0.02. 

find A^ 9) = 0.10, that however we missed since it has bigger trajectory error A^ = 0.033. This indicates that considering 
few g-s with the smallest A^-s we can construct an error bar on each element of Ry \ thus defining a confidence interval 
for each reconstructed value of R^f- As clear from Fig|6] a cluster of points around g(x) = x 19 is a good candidate for such 



setoff -s. Of course, considering many linearly independent g-s from a given functional family would yield a much better 

best and more confident error bars. 

i j 

These findings suggest that through the appropriate tuning of g, we can compensate for the "low quality" of time series, 
which can considerably improve the reconstruction precision, and even allow for estimation of A a- The question of selecting 
the optimal function g which extracts all the network information contained in the time series remains open. Most straightfor- 
wardly, one can search for such g via Monte Carlo method using many linearly independent functions. An intriguing result 
would be a way to design the optimal g based on the time series and interaction functions. On the other hand, the best g might 
be obtainable through techniques such as evolutionary optimization algorithms or machine learning. 

We finish the paper by discussing the limits and proposing further extensions of our method. Our strongest hypothesis 
is the precise knowledge of interaction functions. Despite the availability of good mathematical models for many natural 
interactions, lifting this assumption would greatly enhance the generality of our theory. When approximate functional forms 
are known, interaction functions can be expanded in series, facilitating their reconstruction. This would mean that for each g, 
we obtain not just R^' , but also and . This leads to a possibility of obtaining many different networks, all reproducing 
empirical data equally well, but in pair with different interaction functions. Another extension regards our assumption that the 
mathematical form of interactions is given by EqQ] While a similar theory could be developed for any known form of Eq[T] 
the problem arises for networks whose interactions form is not known. Furthermore, since noise is present in all physical 
processes and experimental measurements, our method should be applicable to noisy empirical data. Finally, we note that our 
problem of network reconstruction is similar to the problem of designing a network with prescribed dynamics. One can use 
our method to design a network that displays given time series, by specifying the tolerance in At- 
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